Programming I - Intro to Python

GEOG 30323

September 1, 2015

Why code?

Why code for data analysis?

  • Automation
  • Documentation and reproducibility
  • Logical organization
  • Marketability!

  • Python: a high-level, object-oriented, general-purpose programming language
  • Interpreted rather than compiled
  • Rapidly becoming the language of choice for introductory programming courses around the world
  • One of the top languages for data analysis

  • Anaconda: a Python distribution from Austin-based Continuum Analytics
  • Over 330 packages for scientific and technical computing with Python
  • Sane package management (you’ll learn more about this next week)
  • Our version: Python 3.4
  • Download link

Why Python? (XKCD)

Source: Randall Munroe/XKCD

Why Python?

In Java, the classic “Hello World” program looks like this:

public class HelloWorld {

    public static void main(String[] args) {
        System.out.println("Hello World");
    }

}

Whereas in Python, you just type:

print("Hello World")

Why Python?

  • Just ask these companies!

Source: Peter Wang - PyData Dallas Keynote

Other options for data analysis

  • R (https://www.r-project.org/): programming language for statistics, data analysis, and much more (and a personal favorite of mine)
  • Julia (http://julialang.org/): relatively new language for technical computing that aims for high-level syntax and C-like speed

Python on the command line

The Jupyter Notebook

  • Browser-based notebooks for literate programming
  • Evolved out of the IPython project
  • Supports multiple languages; “home language” is Python

Launching the Jupyter Notebook

  1. Open your terminal/command prompt: on Windows, search for cmd.exe; on Mac, find Terminal in Applications –> Utilities.
  2. Change into your project directory with cd - this is where you will save your notebooks. Example: cd C:\Users\kylewalker\geog30323. If you need to change drives, add the /D option after cd.
  3. Enter the command ipython notebook (jupyter notebook in newer installations), and you are off!

Literate programming

As defined by Donald Knuth:

Literate programming is a methodology that combines a programming language with a documentation language… The main idea is to treat a program as a piece of literature, addressed to human beings rather than to a computer.

Markdown

  • Tool to convert plain text to HTML; used for literate programming in the Jupyter Notebook

Example:

_This link_ is __truly__ must-see: [click here to view it!](http://personal.tcu.edu/kylewalker/)

This link is truly must-see: click here to view it!

Sage words before we get started…

Numbers and strings

  • At a basic level, Python can function like a calculator, or concatenate strings:
>>> 2 + 3
5
>>> 'x' + 'y'
'xy'
  • Object type: the way in which the object is stored (e.g. float, integer, string)
  • Python is a dynamically typed language, which means that you don’t need to explicitly supply the object type
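
A minimal sketch of what dynamic typing looks like in practice, written as a script rather than at the >>> prompt. The variable names (whole, decimal, text, label) are made up for illustration:

```python
# Python infers the type from the value; no type declaration is needed.
whole = 2 + 3        # integer arithmetic gives an int
decimal = 2 + 3.0    # mixing an int and a float gives a float
text = 'x' + 'y'     # + concatenates strings

print(type(whole))    # <class 'int'>
print(type(decimal))  # <class 'float'>
print(type(text))     # <class 'str'>

# Numbers and strings cannot be added directly;
# convert the number with str() first.
label = 'Total: ' + str(whole)
print(label)          # Total: 5
```

The built-in type() function is a quick way to check how Python has stored an object.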

Variables

  • In programming, a variable is a name that refers to a value, such as a number, a string, or a more complex object
  • Variables are created through assignment

Example:

>>> x = 1
>>> print(x)
1
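
As a small sketch of how assignment behaves, the script below reuses x from the example above and adds a second, hypothetical variable y:

```python
x = 1
y = x + 2     # y is computed from the current value of x
print(y)      # 3

x = 'one'     # re-assignment: x now refers to a string instead
print(x)      # one
print(y)      # 3 (y keeps the value it was given earlier)
```

Because Python is dynamically typed, the same name can refer to an integer at one point and a string later on.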

Strings

  • Strings, or textual representations of data, have a series of special methods that allow for their manipulation

Example:

>>> tcu = 'Texas Christian University'
>>> tcu.swapcase()
'tEXAS cHRISTIAN uNIVERSITY'
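
A few other built-in string methods, sketched with the same tcu string. The result names (shouted, words, shortened) are invented for this example:

```python
tcu = 'Texas Christian University'

shouted = tcu.upper()                    # all characters upper-cased
words = tcu.split()                      # split on whitespace into a list
shortened = tcu.replace('Texas', 'TX')   # substitute one substring

print(shouted)     # TEXAS CHRISTIAN UNIVERSITY
print(words)       # ['Texas', 'Christian', 'University']
print(shortened)   # TX Christian University

# String methods return new strings; the original is unchanged.
print(tcu)         # Texas Christian University
```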

Lists

  • Data structure in Python for storing multiple values; enclosed in brackets []
  • List elements do not need to all be of the same type (though you’ll often want them to be)

Example list: mylist = [2, 4, 6, 8, 10, 12]
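
A brief sketch of working with the example list. The second list, mixed, is a hypothetical one added to show elements of different types:

```python
mylist = [2, 4, 6, 8, 10, 12]

print(len(mylist))    # 6 (number of elements)

mylist.append(14)     # add a value to the end of the list
print(mylist)         # [2, 4, 6, 8, 10, 12, 14]

# Elements do not have to share a type
mixed = [1, 'two', 3.0]
print(mixed)          # [1, 'two', 3.0]
```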

Indexing and slicing

  • Elements of a sequence can be accessed by position using indexing; this works for characters in strings, objects in lists, and much more
  • Python indexing starts at 0 - meaning that the first element is referenced with 0, the second with 1, and so forth
  • Slicing: a:b extracts the subset starting at position a up to, but not including, position b

Example:

>>> tcu[0]
'T'
>>> tcu[6:15]
'Christian'
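
The same rules apply to lists. A short sketch, reusing tcu and the mylist example from the Lists slide:

```python
tcu = 'Texas Christian University'
mylist = [2, 4, 6, 8, 10, 12]

first = mylist[0]      # indexing starts at 0, so this is the first element
middle = mylist[2:5]   # positions 2, 3 and 4; position 5 is not included
state = tcu[0:5]       # the first five characters of the string

print(first)    # 2
print(middle)   # [6, 8, 10]
print(state)    # Texas
```

Note that slicing a list returns a list, while slicing a string returns a string.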